AGT-02: Add tool registry, policy evaluator, and inbox triage template by Chris0Jeky · Pull Request #502 · Chris0Jeky/Taskdeck

Chris0Jeky · 2026-03-29T03:31:00Z

Summary

Implements #337 — the second agent infrastructure slice, building on AGT-01's agent profile/run/event foundation.

Domain layer: ToolScope and ToolRiskLevel enums, ITaskdeckTool interface, ITaskdeckToolRegistry interface, PolicyDecision value object with factory methods (AllowWithReview, Deny, AllowDirect)
Application layer: IAgentPolicyEvaluator interface, AgentPolicyEvaluator service (allowlist enforcement, risk-level review gating, auto-apply OFF by default), TaskdeckToolRegistry (in-memory, thread-safe, case-insensitive), InboxTriageAssistant bounded template (gathers inbox items, creates proposals, never directly mutates board state)
Tests: 42 new tests covering tool registration/lookup/scope filtering, policy evaluation (allowlist, risk levels, disabled profiles, auto-apply opt-in), inbox triage (proposal creation, no direct mutation, policy routing, validation)

Key design decisions:

Review-first for ALL risk levels by default (including Low) — auto-apply only with explicit autoApplyLowRisk: true in profile policy JSON
Tool allowlist is opt-in: empty allowlist means all registered tools are available
InboxTriageAssistant routes all changes through the proposal system — zero direct board mutations
PolicyDecision is an immutable value object with clear factory methods

Test plan

dotnet build backend/Taskdeck.sln -c Release passes
dotnet test backend/Taskdeck.sln -c Release -m:1 passes (all existing + 42 new tests)
Architecture tests pass (no layer violations)
CI pipeline passes

…kTool, ITaskdeckToolRegistry interfaces, and PolicyDecision value object Part of #337 (AGT-02). These domain primitives define the tool abstraction, scope/risk classification, and policy decision contract that the Application layer policy evaluator and tool registry build on.

Implements the Application layer for AGT-02 (#337): - IAgentPolicyEvaluator interface for tool-use policy checks - TaskdeckToolRegistry: in-memory, thread-safe tool registry - AgentPolicyEvaluator: allowlist enforcement, risk-level review gating, auto-apply OFF by default for all risk levels - TaskdeckToolDefinition: concrete ITaskdeckTool record - InboxTriageAssistant: bounded template that creates proposals from pending inbox items, never directly mutating board state

…tant 42 tests covering: - ToolRegistryTests: registration, duplicate rejection, lookup, case-insensitive keys, scope filtering - AgentPolicyEvaluatorTests: allowlist enforcement, risk-level review gating (high/medium always review, low review by default), auto-apply opt-in, disabled profile denial, policy JSON parsing - InboxTriageAssistantTests: proposal creation, no direct board mutation, policy routing, validation, edge cases

gemini-code-assist · 2026-03-29T03:31:04Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

chatgpt-codex-connector · 2026-03-29T03:31:05Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, you can upgrade your account or add credits to your account and enable them for code reviews in your settings.

Add missing `using` on JsonDocument.Parse to prevent resource leak. Found during adversarial self-review.

Chris0Jeky · 2026-03-29T03:32:07Z

Adversarial Self-Review

Reviewed the full diff critically. Findings:

Fixed

JsonDocument not disposed in AgentPolicyEvaluator.ParsePolicy -- JsonDocument implements IDisposable and was not wrapped in a using statement. Fixed in commit e4792c8.

Noted (acceptable)

AgentPolicyConfig is internal -- This is intentional; it's a parsing implementation detail, not part of the public API. The ParsePolicy method is internal static to allow direct unit testing without exposing the config type publicly.
InboxTriageAssistant uses GetByUserAsync then filters in-memory -- This loads all user LLM requests, not just pending ones. The existing ILlmQueueRepository interface does not expose a combined user+status filter. This is acceptable for now (inbox items per user are bounded) but should be addressed if item counts grow. A GetByUserAndStatusAsync method could be added in a follow-up.
No DI registration -- The new services (TaskdeckToolRegistry, AgentPolicyEvaluator, InboxTriageAssistant) are not wired into the DI container in Program.cs or any startup class. This is intentional for this slice -- DI wiring will be added when the agent API endpoints are created (future issue). The services are fully unit-testable without DI.
No denied_tools / blocklist support -- Policy only supports allowlists, not blocklists. This is simpler and more secure (deny-by-default when allowlist is specified). Can be extended later if needed.

No issues found in

Layer boundaries (Domain has no infra/app dependencies)
Architecture tests pass (8/8)
Naming consistency
Test coverage for edge cases (null keys, empty GUIDs, disabled profiles, malformed JSON)
Review-first invariant (InboxTriageAssistant never calls SaveChangesAsync directly)

Chris0Jeky · 2026-03-29T03:37:19Z

Adversarial Code Review — PR #502 (AGT-02)

I read every line of the diff across all 10 new files. Here are my findings, ordered by severity.

BUG: Missing DI Registration

backend/src/Taskdeck.Api/Extensions/ApplicationServiceRegistration.cs — None of the three new services are registered in the DI container:

ITaskdeckToolRegistry / TaskdeckToolRegistry (should be Singleton — it's an in-memory registry populated at startup)
IAgentPolicyEvaluator / AgentPolicyEvaluator (Scoped, since it depends on IUnitOfWork)
InboxTriageAssistant (Scoped)

Without these registrations, nothing in this PR can be resolved at runtime. Any controller or service depending on these will fail with a System.InvalidOperationException at DI resolution time.

BUG: No Tool Registration at Startup

InboxTriageAssistant.GetToolDefinition() exists but is never called during startup. There is no code that does:

registry.RegisterTool(InboxTriageAssistant.GetToolDefinition());

Without this, _toolRegistry.GetTool("inbox.triage") in the policy evaluator will always return null, and every triage request will be denied with "Tool 'inbox.triage' is not registered."

Both this and the DI registration above need startup wiring — likely in ApplicationServiceRegistration or Program.cs.

DESIGN: PolicyDecision Equality Semantics

backend/src/Taskdeck.Domain/Agents/PolicyDecision.cs — PolicyDecision overrides Equals(object?) and GetHashCode() but:

Does not implement IEquatable<PolicyDecision>
Does not override operator == / operator !=

This means decision1 == decision2 uses reference equality, but decision1.Equals(decision2) uses value equality. This is a classic C# gotcha. If this is intended as a value object (the XML doc says so), it should either:

Be a record (which gets all of this for free), or
Implement IEquatable<PolicyDecision> and override ==/!=

Recommendation: make it public sealed record PolicyDecision with a private constructor and keep the factory methods. Records give you immutability, value equality, ==/!=, ToString(), and deconstruction for free.

DESIGN: ITaskdeckToolRegistry in Domain Layer

backend/src/Taskdeck.Domain/Agents/ITaskdeckToolRegistry.cs — This interface contains void RegisterTool(ITaskdeckTool tool) which is a mutation/service operation. The Domain layer should be free of infrastructure service abstractions per CLAUDE.md ("Core entities and business rules. No infrastructure dependencies — keep it pure.").

Consider: Keep ITaskdeckTool and enums in Domain (they describe business concepts). Move ITaskdeckToolRegistry to Taskdeck.Application.Interfaces — the Application layer is where service interfaces belong. The evaluator already lives in Application and can reference it there.

This doesn't fail architecture tests today because the Domain files don't import forbidden namespaces, but it loosens the Domain purity principle.

MINOR: AgentPolicyConfig Mutability

backend/src/Taskdeck.Application/Services/AgentPolicyEvaluator.cs:168-173 — AgentPolicyConfig.AllowedTools is IReadOnlyList<string> but backed by new List<string>(). A caller could cast and mutate: ((List<string>)config.AllowedTools).Add("malicious.tool"). The default uses Array.Empty<string>() which is truly immutable. For consistency and defense-in-depth, wrap with .AsReadOnly() or use allowedTools.ToArray() in ParsePolicy.

Since this is internal, the practical risk is low.

MINOR: No Test for TruncateTitle Edge Cases

InboxTriageAssistant.TruncateTitle handles multi-line input, empty lines, and length truncation, but there are no direct tests for:

Empty string payload
Payload with only whitespace/newlines
Payload exceeding 200 characters on the first line
Payload that is exactly 200 characters

These are covered indirectly through integration tests but the method deserves its own unit test (it's private static so would need to be made internal with InternalsVisibleTo, or tested through the public API with crafted payloads).

MINOR: Deny Decision Semantics

PolicyDecision.Deny creates new(false, false, reason) — meaning RequiresReview = false on a denied action. Semantically, a denied action doesn't "not require review" — the concept is N/A. This isn't a functional bug (callers check Allowed first), but it could confuse future readers. Consider adding a comment or using a nullable bool? for RequiresReview.

What Looks Good

Thread safety: ConcurrentDictionary with StringComparer.OrdinalIgnoreCase in the registry is correct. TryAdd is atomic. Registration-time validation is clean.
Policy evaluation logic: The cascade (empty ID → unregistered tool → missing profile → disabled profile → allowlist → risk level) is thorough and correctly ordered. High/Medium risk always require review regardless of autoApplyLowRisk.
Review-first default: Low-risk tools default to review-first, auto-apply only with explicit opt-in. This aligns with the project's golden principle.
No direct board mutation: InboxTriageAssistant never calls SaveChangesAsync — all changes go through proposals. The test for this (RunTriage_ShouldNeverDirectlyMutateBoard) is well-designed.
JSON parsing resilience: ParsePolicy handles null, empty, "{}", malformed JSON, and missing properties gracefully, always returning safe defaults.
Test coverage: 42 tests covering all major paths. Mock setup is realistic. Edge cases for policy parsing are good.
Naming and conventions: PascalCase for publics, camelCase for locals, proper C# idioms throughout. Clean XML documentation.
Architecture purity: Application layer does not import Infrastructure or API. Domain files have no forbidden imports.

Verdict

The two DI/startup registration issues are blocking — without them, the entire feature is dead code that will fail at runtime. The PolicyDecision equality issue is a significant design concern that should be addressed before or shortly after merge to avoid future bugs when someone inevitably writes if (decision == expected).

The remaining items are minor and can be addressed in follow-up.

The tool registry, policy evaluator, and inbox triage assistant were missing DI registrations, making the entire feature unresolvable at runtime. Also registers the inbox.triage tool definition at startup so the policy evaluator can find it.

Chris0Jeky added 3 commits March 29, 2026 04:24

github-project-automation bot added this to Taskdeck Execution Mar 29, 2026

github-project-automation bot moved this to Pending in Taskdeck Execution Mar 29, 2026

Fix JsonDocument disposal in AgentPolicyEvaluator.ParsePolicy

e4792c8

Add missing `using` on JsonDocument.Parse to prevent resource leak. Found during adversarial self-review.

Chris0Jeky merged commit 0f38f82 into main Mar 29, 2026
18 checks passed

Chris0Jeky deleted the agt/tool-registry-policy-template branch March 29, 2026 03:46

github-project-automation bot moved this from Pending to Done in Taskdeck Execution Mar 29, 2026

Chris0Jeky mentioned this pull request Apr 1, 2026

AGT-02: Add tool registry, policy evaluator, and bounded inbox-triage template #337

Closed

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

AGT-02: Add tool registry, policy evaluator, and inbox triage template#502

AGT-02: Add tool registry, policy evaluator, and inbox triage template#502
Chris0Jeky merged 5 commits intomainfrom
agt/tool-registry-policy-template

Chris0Jeky commented Mar 29, 2026

Uh oh!

gemini-code-assist bot commented Mar 29, 2026

Uh oh!

chatgpt-codex-connector bot commented Mar 29, 2026

Uh oh!

Chris0Jeky commented Mar 29, 2026

Uh oh!

Chris0Jeky commented Mar 29, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Chris0Jeky commented Mar 29, 2026

Summary

Test plan

Uh oh!

gemini-code-assist bot commented Mar 29, 2026

Uh oh!

chatgpt-codex-connector bot commented Mar 29, 2026

Uh oh!

Chris0Jeky commented Mar 29, 2026

Adversarial Self-Review

Fixed

Noted (acceptable)

No issues found in

Uh oh!

Chris0Jeky commented Mar 29, 2026

Adversarial Code Review — PR #502 (AGT-02)

BUG: Missing DI Registration

BUG: No Tool Registration at Startup

DESIGN: PolicyDecision Equality Semantics

DESIGN: ITaskdeckToolRegistry in Domain Layer

MINOR: AgentPolicyConfig Mutability

MINOR: No Test for TruncateTitle Edge Cases

MINOR: Deny Decision Semantics

What Looks Good

Verdict

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant